A nearest-neighboring-end algorithm for genetic mapping
Abstract
MOTIVATION: High-throughput methods are beginning to make possible the genotyping of thousands of loci in thousands of individuals, which could be useful for tightly associating phenotypes with candidate loci. Current mapping algorithms cannot handle this much data without building hierarchies of framework maps.

RESULTS: A version of Kruskal's minimum spanning tree algorithm can solve any genetic mapping problem that can be stated as marker deletion from a set of linkage groups. These problems include backcross, recombinant inbred, haploid and double-cross recombinational populations, in addition to conventional deletion and radiation hybrid populations. The algorithm progressively joins linkage groups at increasing recombination fractions between terminal markers, and attempts to recognize and correct erroneous joins at peaks in recombination fraction. The algorithm is O(mn^3) for m individuals and n markers, but the mean run time scales close to mn^2. It is amenable to parallel processing and has recovered true map order in simulations of large backcross, recombinant inbred and deletion populations with up to 37,005 markers. Simulations were used to investigate map accuracy in response to population size, allelic dominance, segregation distortion, missing data and random typing errors. The algorithm produced accurate maps when marker distribution was sufficiently uniform, although segregation distortion could induce translocated marker orders. The algorithm was also used to map 1003 loci in the F7 ITMI population of bread wheat, Triticum aestivum L. emend Thell., where it shortened an existing standard map by 16%, but it failed to associate blocks of markers properly across gaps within linkage groups. This failure arose because the algorithm depends upon the rankings of recombination fractions at individual markers, and is susceptible to sampling error, typing error and joint selection involving the terminal markers of nearly finished linkage groups. Therefore, the current form of the algorithm is useful mainly to improve local marker ordering in linkage groups obtained in other ways.

AVAILABILITY: The source code and supplemental data are available at http://www.iubio.bio.indiana.edu/soft/molbio/qtl/flipper/

CONTACT: [email protected]
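The joining step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes backcross-style 0/1 genotypes, estimates the recombination fraction as the share of individuals whose genotypes differ at two markers, and omits the step that recognizes and corrects erroneous joins. The function names and the `max_rf` cutoff are hypothetical and were chosen only for this illustration.

```python
from itertools import combinations


def recombination_fraction(a, b):
    """Fraction of individuals whose genotypes differ at two markers.

    Genotypes are 0/1; None marks missing data and is skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 1.0  # assumption: no shared data means "treat as unlinked"
    return sum(x != y for x, y in pairs) / len(pairs)


def join_by_nearest_ends(genotypes, max_rf=0.5):
    """Kruskal-style greedy ordering that joins groups only at terminal markers.

    genotypes: dict of marker name -> list of 0/1/None, one entry per individual.
    max_rf: hypothetical cutoff; pairs at or above it are never joined.
    Returns a list of linkage groups, each a list of marker names in map order.
    """
    groups = {name: [name] for name in genotypes}  # group id -> ordered markers
    owner = {name: name for name in genotypes}     # marker -> group id

    # Consider marker pairs in order of increasing recombination fraction.
    edges = sorted(
        (recombination_fraction(genotypes[a], genotypes[b]), a, b)
        for a, b in combinations(sorted(genotypes), 2)
    )
    for rf, a, b in edges:
        if rf >= max_rf:
            break
        ga, gb = owner[a], owner[b]
        if ga == gb:
            continue  # same group already; joining would close a cycle
        left, right = groups[ga], groups[gb]
        # Only terminal markers may be joined, so every group stays a linear path.
        if a not in (left[0], left[-1]) or b not in (right[0], right[-1]):
            continue
        if left[-1] != a:
            left.reverse()   # bring a to the tail of the left group
        if right[0] != b:
            right.reverse()  # bring b to the head of the right group
        left.extend(right)
        for marker in right:
            owner[marker] = ga
        del groups[gb]
    return list(groups.values())
```

Sorting all marker pairs up front keeps the sketch short, but it stores every pairwise recombination fraction, so it is meant for reading rather than for the marker counts mentioned in the abstract.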
Journal: Bioinformatics
Volume 21, Issue 8
Publication date: 2005